Reusing Old Policies to Accelerate Learning on New MDPs

Author

  • Daniel S. Bernstein
Abstract

We consider the reuse of policies for previous MDPs in learning on a new MDP, under the assumption that the vector of parameters of each MDP is drawn from a fixed probability distribution. We use the options framework, in which an option consists of a set of initiation states, a policy, and a termination condition. We use an option called a reuse option, for which the set of initiation states is the set of all states, the policy is a combination of policies from the old MDPs, and the termination condition is based on the number of time steps since the option was initiated. Given policies for m of the MDPs from the distribution, we construct reuse options from the policies and compare performance on an m + 1st MDP both with and without various reuse options. We find that reuse options can speed initial learning of the m + 1st task. We also present a distribution of MDPs for which reuse options can slow initial learning. We discuss reasons for this and suggest other ways to design reuse options.
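The abstract specifies the three components of a reuse option: an initiation set equal to all states, a policy combining the old MDPs' policies, and a time-based termination condition. A minimal sketch of that structure follows; the class name, the majority-vote combination rule, and the dict-based policy representation are illustrative assumptions, not the paper's actual construction:

```python
class ReuseOption:
    """Sketch of a reuse option in the options framework.

    Assumptions (not from the paper): old policies are dicts mapping
    state -> action, and they are combined by majority vote.
    """

    def __init__(self, old_policies, max_steps):
        self.old_policies = old_policies  # policies from the m old MDPs
        self.max_steps = max_steps        # termination horizon in time steps
        self.steps = 0                    # steps since the option was initiated

    def can_initiate(self, state):
        # The initiation set is the set of all states.
        return True

    def act(self, state):
        # Combine the old policies: take the action most of them choose.
        votes = {}
        for policy in self.old_policies:
            action = policy[state]
            votes[action] = votes.get(action, 0) + 1
        self.steps += 1
        return max(votes, key=votes.get)

    def should_terminate(self):
        # Termination depends only on elapsed time steps since initiation.
        return self.steps >= self.max_steps
```

A learner would treat this option as one more (temporally extended) action: initiate it anywhere, follow its combined policy, and regain control once the step budget is spent.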


Similar articles

Reusing Risk-Aware Stochastic Abstract Policies in Robotic Navigation Learning

In this paper we improve the learning performance of a risk-aware robot facing navigation tasks by employing transfer learning; that is, we use information from a previously solved task to accelerate learning in a new task. To do so, we transfer risk-aware memoryless stochastic abstract policies into a new task. We show how to incorporate risk-awareness into robotic navigation tasks, in particular wh...


Exploration Strategies for Reusing Past Policies

The balance between exploring new actions and states and exploiting the knowledge acquired while learning has been widely studied in Reinforcement Learning. There is also a clear interest in how past policies that solve different tasks may help to solve a new one, which likewise requires a balance between exploring, exploiting past policies, and exploiting the current one. In this paper, we show that re...


Reusing Learned Policies Between Similar Problems

We are interested in leveraging policies learned for similar problems when learning policies for complex new problems. This capability is particularly important in robot learning, where gathering data is expensive and time-consuming, which prohibits directly applying reinforcement learning. In this case, we would like to be able to transfer knowledge from a simulator, which may have an inacc...


Learning and Reusing Goal-Specific Policies for Goal-Driven Autonomy

In certain adversarial environments, reinforcement learning (RL) techniques require a prohibitively large number of episodes to learn a high-performing strategy for action selection. For example, Q-learning is particularly slow to learn a policy to win complex strategy games. We propose GRL, the first GDA system capable of learning and reusing goal-specific policies. GRL is a case-based goal-dri...


Reusing Learning Objects and the Impact of Web 3.0 on e-Learning Platforms

E-Learning promotes the exchange of experiences and knowledge that facilitates student learning without the time and space restrictions imposed by traditional models. The potential for reusability is a primary attraction for educators when discussing learning objects. Reusing learning objects is as old as retelling a story or making use of libraries and textbooks. In electronic for...



Publication year: 1999